Abstract


Both helper dependent expression systems, based on two components, and single genomes constructed by targeted 
recombination, or by using infectious cDNA clones, have been developed. The sequences that regulate transcription 
have been characterized mainly using helper dependent expression systems and it will now be possible to validate 
them using single genomes. The genome of coronaviruses has been engineered by modification of the infectious cDNA 
leading to an efficient (> 20 µg ml⁻¹) and stable (> 20 passages) expression of the foreign gene. The possibility of
engineering the tissue and species tropism to target expression to different organs and animal species, including 
humans, increases the potential of coronaviruses as vectors. Thus, coronaviruses are promising virus vectors for
vaccine development and, possibly, for gene therapy. © 2001 Elsevier Science B.V. All rights reserved.


1. Introduction


Coronaviruses have several advantages for use 
as vectors over other viral expression systems: (i) 
coronaviruses are single-stranded RNA viruses 
that replicate within the cytoplasm without a 
DNA intermediary, making integration of the 
virus genome into the host cell chromosome unlikely
(Lai and Cavanagh, 1997); (ii) these viruses
have the largest RNA virus genome and, in principle,
have room for the insertion of large foreign
genes (Masters, 1999; Enjuanes et al., 2000a); (iii)
a pleiotropic secretory immune response is best 
induced by the stimulation of gut-associated 
lymphoid tissues. Since coronaviruses in general 
infect the mucosal surfaces, both respiratory and 
enteric, they may be used to target the antigen to 
the enteric and respiratory areas to induce a 
strong secretory immune response (Enjuanes and 
Van der Zeijst, 1995); (iv) the tropism of coro- 
naviruses may be modified by the manipulation of 
the spike (S) protein allowing engineering of the
tropism of the vector (Ballesteros et al., 1997; 
Leparc-Goffart et al., 1998; Sanchez et al., 1999; 
Kuo et al., 2000); (v) non-pathogenic coronavirus 
strains infecting most species of interest (human, 
porcine, bovine, canine and feline) are available to 
develop expression systems (Sanchez et al., 1992); 
and (vi) infectious coronavirus cDNA clones are 
available to design expression systems (Almazan 
et al., 2000; Yount et al., 2000; Thiel et al., 2001). 

Vectors for the expression of heterologous 
genes have been developed from full-length 
cDNA clones of most of the positive-strand RNA 
viruses. These viruses can be classified according 
to the nature of their genome (one or more RNA 
fragments) and their expression strategy, for in- 
stance, a single mRNA encoding a polyprotein 
that is processed into functional proteins or a 
collection of mRNAs each encoding a protein. 
Expression systems based on positive-strand RNA 
viruses that are transcribed in a single mRNA 
molecule, such as picornaviruses (poliovirus)
(Andino et al., 1994), and more recently
flaviviruses (Khromykh and Westaway, 1994, 
1997; Chambers et al., 1999) have been developed. 
The alphaviruses (Togaviridae family), encoding a 
full-length mRNA and a subgenomic mRNA, are 
among the most advanced expression systems 
(Liljeström, 1994; Frolov et al., 1997; Pushko et
al., 1997; Caley et al., 1999). The alphaviruses
include the Sindbis virus, Semliki Forest virus 
(SFV) and Venezuelan equine encephalitis virus 
(VEEV), and are very efficient at eliciting humoral 
and cellular immune responses. 

Two types of expression systems have been 
developed based on coronavirus genomes (Fig. 1):
one requires two components (helper dependent 
expression system) and the other a single genome 
that is modified either by targeted recombination 
or by engineering a cDNA encoding an infectious 
RNA. Coronavirus derived expression systems are 
being developed for human, porcine, murine, 
bovine and avian coronaviruses. The first attempt 
to use coronavirus for heterologous gene expres- 
sion was based on the mouse hepatitis virus 
(MHV) by using a helper dependent expression 
system (Lin and Lai, 1993). Group 1 coro- 
naviruses, such as transmissible gastroenteritis 
virus (TGEV), and group 3 coronaviruses, such as 
infectious bronchitis virus (IBV), have also been 
used for foreign gene expression. 

Among the positive-strand RNA viruses, coro- 
naviruses have the largest genome size (around 30 
kb) and, in principle, could have the largest 
cloning capacity (Enjuanes et al., 2000a). This 
review will focus on the description of the advantages
and limitations of these novel coronavirus
expression systems, the attempts to increase their
expression levels by studying the transcription
regulatory sequences (TRSs), and the proven possibility
of modifying their tissue and species-specificity.

Fig. 1. Coronavirus derived expression systems: A. Helper dependent expression system based on two components, the helper virus
and a minigenome carrying the foreign gene (FG). An, poly A. B. Single genome engineered either by targeted recombination or by
using an infectious coronavirus cDNA clone (pBAC-TGEV^FL) derived from TGEV genome.

Limited progress on the understanding of repli- 
cation and translation regulation in coronaviruses 
has been made. To obtain information on these 
aspects the reader is referred to recent reviews and 
selected papers where this issue has been ad- 
dressed (Luytjes et al., 1988; Lai and Cavanagh, 
1997; Tahara et al., 1998; O’Connor and Brian, 
2000) since translation and replication will not be 
considered within this review. 


2. Pathogenicity of coronaviruses 


Coronaviruses comprise a large family of 
viruses infecting a broad range of vertebrates, 
from mammalian to avian species. Coronaviruses 
are associated mainly with respiratory, enteric, 
hepatic and central nervous system diseases. Nev- 
ertheless, organs such as kidney, heart, and eye 
can also be affected. In humans and fowl, coronaviruses
primarily cause upper respiratory tract
infections, while porcine and bovine coronaviruses
(BCoVs) establish enteric infections that
result in severe economic losses.

The human coronaviruses (HCoV) are responsible
for 10–20% of all common colds (McIntosh et
al., 1969), and have been implicated in gastroenteritis,
upper and lower respiratory tract infections
and rare cases of encephalitis. HCoV have also
been associated with infant necrotizing enterocoli- 
tis (Resta et al., 1985; Luby et al., 1999) and are 
tentative candidates for multiple sclerosis (Talbot, 
1997). However, HCoV have languished at the 
bottom of many lists of human pathogens because 
of the difficulty in isolating and characterizing the 
agents during outbreaks of illness (Denison, 
1999). In addition, infections of man by coro- 
naviruses seem to be ubiquitous, as coronaviruses 
have been identified wherever they have been 
looked for, including North and South America, 
Europe, and Asia, and no other human disease has
been clearly associated with them, with the exception
of the respiratory and enteric infections (Denison,
1999).


Table 1
Coronaviridae family members

Group and virus                                          Acronym

Group 1
  Human coronavirus 229E                                 HCoV-229E
  Porcine enteric (transmissible gastroenteritis         PCoV
    virus, TGEV; and porcine epidemic diarrhea
    virus, PEDV) and respiratory (PRCoV) coronavirus
  Canine coronavirus                                     CCoV
  Feline coronavirus, including feline                   FCoV
    infectious peritonitis virus (FIPV)

Group 2
  Human coronavirus OC43                                 HCoV-OC43
  Bovine coronavirus                                     BCoV
  Turkey coronavirus BCoV related                        TCoV-B
  Murine coronaviruses including mouse                   MCoV
    hepatitis virus (MHV)
  Porcine hemagglutinating encephalomyelitis virus       HEV
  Rat coronavirus including                              RtCoV
    sialodacryoadenitis virus (SDAV)

Group 3
  Avian coronavirus including infectious                 ACoV
    bronchitis virus (IBV)
  Turkey coronavirus IBV related                         TCoV-I

Unclassified coronavirus
  Rabbit coronavirus                                     RbCoV

Epithelial cells are the main target of coro- 
naviruses. Widely distributed cells such as 
macrophages are also infected by coronaviruses. 
These viruses have relatively restricted host 
ranges, infecting only their natural host and 
closely related animal species. Coronavirus bio- 
logical vectors are not known. 


3. Coronavirus members 


The coronavirus and the torovirus genera form 
the Coronaviridae family, which is closely related 
to the Arteriviridae family. Both families are in- 
cluded in the Nidovirales order (Enjuanes et al., 
2000a,b). Recently, a new group of invertebrate 
viruses, the okaviruses, with a genetic structure 
and replication strategy similar to those of coro- 
naviruses, has been described (Cowley et al., 
2000). The coronaviruses have been classified in 
three groups that comprise the members listed in 
Table 1. The murine coronaviruses (MCoV) have
been extremely useful to study gene expression
using systems based on two components, a helper
virus and a minigenome in which the heterologous
gene was inserted. The advanced state of the
research performed with coronaviruses and arteriviruses
led to the development of single
genome expression systems based on both virus
families. Infectious cDNA clones are available for
porcine (Almazan et al., 2000; Yount et al., 2000)
and human (Thiel et al., 2001) coronaviruses, and
for the arteriviruses equine arteritis
virus (EAV) (van Dinten et al., 1997; de Vries et al.,
2000) and porcine reproductive and respiratory
syndrome virus (PRRSV) (Meulenberg et al.,
1998). The availability of these cDNAs and the
application of targeted recombination to coronaviruses
(Masters, 1999) have been essential for
the development of vectors based on the
Nidovirales.


4. Molecular biology of coronavirus 


4.1. The coronavirus genome 


Virions contain a single molecule of linear, 
positive-sense, single-stranded RNA (Fig. 2B). 
The genomic RNA is the largest viral RNA 
genome known ranging from 27.6 to 31.3 kb in 
size. Coronavirus RNA has a 5’ terminal cap 
followed by a leader sequence of 65-98 nucle- 
otides and an untranslated region of 200-400 
nucleotides. At the 3’ end of the genome there is 
an untranslated region of 200-500 nucleotides 
followed by a poly(A) tail. The virion RNA, 
which functions as a mRNA and is infectious, 
contains ~ 7-10 functional genes, four or five of 
which encode structural proteins. The genes are
arranged in the order 5’-polymerase-(HE)-S-E-M-N-3’,
with a variable number of other genes that
are believed to be non-structural and largely non-essential,
at least in tissue culture.

Fig. 2. Structure and genome organization of coronaviruses: A. Schematic diagram of virus structure showing the envelope, the core
and the nucleoprotein structure. S, spike protein; M and M’, M proteins with the amino-terminus facing the external surface of the
virion and the carboxy-terminus towards the inside or the outside face of the virion, respectively; E, small envelope protein; N,
nucleocapsid protein; NC, nucleocapsid. Some coronaviruses of group 2 have an additional protein, the haemagglutinin-esterase
(HE) (not shown). B. Representation of a prototype TGEV coronavirus genome and subgenomic RNAs. Beneath the top bar a set
of positive- and negative-sense mRNA species synthesized in infected cells is shown. The protein products obtained from each
positive-sense RNA are indicated. Two products, polyproteins 1a and 1b, are translated from the genomic RNA by a ribosomal
frameshifting mechanism. All other proteins are translated from the first open reading frame of each functionally monocistronic
subgenomic RNA (dark lines). Poly(A) and Poly(U) tails are indicated by AAA or UUU. S, spike protein; E, envelope protein; M,
membrane protein; N, nucleocapsid protein.

About two-thirds of the entire RNA comprises 
the ORF1a/b encoding the replicase gene. At the
overlap between the ORF 1a and 1b regions,
there is a specific seven-nucleotide ‘slippery’ sequence
and a pseudoknot structure (ribosomal
frameshifting signal), which are required for the 
translation of ORF 1b. In the 3’ end, one-third of 
the genome comprises the genes encoding the 
structural proteins and the other non-structural 
ones. Organization of the non-structural protein 
genes, which are interspersed between the known 
structural protein genes, varies significantly 
among different coronavirus strains (Enjuanes et 
al., 2000a). A pseudoknot structure is also pre- 
dicted at the 3’ end of the coronaviral RNA 
(Williams et al., 1995; Hsue and Masters, 1997; 
Brian, 2001). 

Coronavirus transcription occurs via an RNA- 
dependent RNA synthesis process in which mR- 
NAs are transcribed from negative-stranded 
templates. Sequences at the 5’ end of each gene 
represent signals for the transcription of subge- 
nomic mRNAs (Lai and Cavanagh, 1997; Sawicki 
and Sawicki, 1998). These sequences, known as
TRSs, include a stretch of a highly conserved
sequence designated the core sequence (CS), located
at sites immediately upstream of most of the
genes. The CS presents some variation in sequence
length among the coronaviruses, being
5’-CUAAAC-3’ for TGEV, or a related sequence,
depending on the coronavirus (e.g. UCUAAAC
for MHV). In previous reports the CS has been
named intergenic sequence (IS). Since often genes 
overlap in the Nidovirales, the acronym IS does 
not seem appropriate in these cases and the 
acronym CS could reflect the nature of the highly 
conserved sequence contained within the TRS. 
Coronavirus mRNAs consist of six to eight types 
of varying sizes, depending on the coronavirus 
strain and the host species. The largest mRNA is 
the genomic RNA, which also serves as the
mRNA for ORF 1a and 1b, and the remainder are
subgenomic mRNAs. The mRNAs have a nested-set
structure in relation to the genome structure
(Fig. 2B). Except for the smallest mRNA, all of
the mRNAs are structurally polycistronic. In general,
only the 5’-most ORF of each mRNA is
translated. However, there are exceptions: some 
mRNAs, e.g. mRNA 5 of MHV, mRNA 3 of IBV 
and BCoV nucleocapsid mRNA are translated by 
internal initiation into two or three proteins 
(Lapps et al., 1987; Krishnan et al., 1996). 
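
The nested-set arrangement described above can be illustrated with a minimal sketch. It assumes a deliberately simplified model in which each mRNA is the 5’ leader joined to the genome from a core sequence (CS) onwards, so that all mRNAs share the same 3’ end. The toy genome, the leader length and the function names are invented for illustration and do not correspond to any real coronavirus sequence.

```python
# Toy illustration only: the sequence below is invented, not a real genome.
CORE_SEQUENCE = "CUAAAC"  # CS given in the text for TGEV

# Hypothetical layout: leader, CS, "replicase", CS, gene, CS, gene, poly(A)
TOY_GENOME = (
    "ACGUU" + CORE_SEQUENCE + "GGGGGGGG"
    + CORE_SEQUENCE + "AUGCCC"
    + CORE_SEQUENCE + "AUGUUU" + "AAAA"
)


def find_core_sequences(genome: str, cs: str = CORE_SEQUENCE) -> list[int]:
    """Return the 0-based start position of every CS occurrence."""
    positions = []
    start = genome.find(cs)
    while start != -1:
        positions.append(start)
        start = genome.find(cs, start + 1)
    return positions


def nested_set(genome: str, leader_length: int, cs: str = CORE_SEQUENCE) -> list[str]:
    """Model each mRNA as the leader joined to the genome from one CS to the 3' end.

    The first CS sits directly after the leader, so the first mRNA returned
    is the full-length genomic RNA; the others are subgenomic.
    """
    leader = genome[:leader_length]
    sites = [p for p in find_core_sequences(genome, cs) if p >= leader_length]
    return [leader + genome[p:] for p in sites]


mrnas = nested_set(TOY_GENOME, leader_length=5)
for rna in mrnas:
    print(len(rna), rna)
```

Every RNA printed ends with the same 3’ sequence, and each shorter RNA lacks only the 5’-proximal genes of the longer ones, which is why all but the smallest mRNA are structurally polycistronic.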


4.2. Coronavirus proteins 


Coronaviruses are enveloped viruses that con- 
tain a core that includes the ribonucleoprotein 
formed by the RNA and nucleoprotein N (Fig. 
2A). The core is formed by the genomic RNA, the 
N protein and the carboxy-terminus of the mem- 
brane (M) protein. Most of the M protein is 
embedded within the membrane but its carboxy- 
terminus is integrated within the core and seems 
essential to maintain the core structure (Escors et 
al., 2001). The TGEV M protein presents two 
topologies. In one, both the amino- and the car- 
boxy-terminus face the outside of the virion, while 
in the other the carboxy-terminus is inside (Risco 
et al., 1995). In addition, the virus envelope con- 
tains two or three other proteins, the S protein, 
the small membrane protein (E) and, in some 
strains, the hemagglutinin-esterase (HE) (En- 
juanes et al., 2000a). The ratios of S:E:M:N 
proteins vary in different reports. For purified 
TGEV, these ratios have been estimated to be 
20:1:300:140, respectively (Escors et al., 2001). 
The S protein is large, ranging from 1160 to 1452 
amino acids, and in some coronaviruses is cleaved 
into S1 and S2 subunits. The S protein is responsible
for attachment to cells, hemagglutination,
membrane fusion and induction of neutralizing 
antibodies. 

The replicase gene is predicted to encode a
protein of ~ 740-800 kDa which is co-translationally
processed. Several domains within the
replicase have predicted functions based on regions
of nucleotide homology, including two papain-like
cysteine proteases, a chymotrypsin-picornaviral
3C-like protease, a cysteine-rich
growth factor-related protein, an RNA-dependent
RNA polymerase, a nucleoside triphosphate
(NTP)-binding/helicase domain and a zinc-finger
nucleic acid-binding domain (Siddell, 1995; Enjuanes
et al., 2000a; Penzes et al., 2001).

Fig. 3. Summary of helper dependent expression systems based on coronavirus derived minigenomes: A–C. Expression modules
based on MHV minigenomes DIssF and DIssE cloned under the control of T7 bacteriophage polymerase (T7), used to express
chloramphenicol acetyltransferase (CAT), hemagglutinin-esterase (HE) or interferon-γ using either an IRES (A), or TRSs (B–C).
D–E. Expression modules based on the TGEV derived minigenome M39 used to express the GUS. The minigenome was cloned
either after T7 (D) or the CMV (E) promoters. F. Expression module based on the IBV derived minigenome CD-61 used to express
CAT.


5. Helper dependent expression systems


The coronaviruses have been classified into 
three groups (1, 2 and 3) (Table 1) based on 
sequence analysis of a number of coronavirus 
genes (Siddell, 1995). The helper dependent ex- 
pression systems have been developed using mem- 
bers of the three groups of coronaviruses (Fig. 3), 
and will be addressed first. 


5.1. Helper dependent expression systems based 
on group 1 coronaviruses 


Group 1 coronaviruses include porcine, canine,
feline and human coronaviruses. Nevertheless, expression
systems have been developed only for the porcine and
human coronaviruses, since minigenomes are available
only for these two.

Using the TGEV-derived minigenomes (Fig.
3D–E) an expression system has been developed
(Méndez et al., 1995; Izeta et al., 1999). The
TGEV-derived RNA minigenomes were successfully
expressed in vitro using T7 polymerase and
amplified after in vivo transfection using a helper
virus. To engineer cDNAs encoding TGEV defec- 
tive RNAs, a deletion mutant of 9.7 kb (DI-C) 
maintaining the cis-signals required for efficient 
and stable replication and packaging by helper 
virus was isolated (Izeta et al., 1999). A collection 
of 14 DI-C RNA deletion mutants (TGEV 
minigenomes) was synthetically generated and 
tested for their ability to be replicated and pack- 
aged. The smallest minigenome (M33) that was 
replicated by the helper virus and efficiently pack- 
aged was 3.3 kb in length. TGEV derived 
minigenomes of 3.3, 3.9 and 5.4 kb (named M33, 
M39, and M54, respectively) were efficiently used 
for the expression of heterologous genes. 

Using the M39 minigenome, a two-step amplification
system was developed, similar to other
amplification systems (Herweijer et al., 1995;
Dubensky et al., 1996; Berglund et al., 1998),
based on the cloning of a cDNA copy of the
minigenome after the immediate-early cytomegalovirus
promoter (CMV). Minigenome
RNAs are first amplified in the nucleus by the
cellular RNA pol II and then the RNAs are
translocated into the cytoplasm where they are
amplified by the viral replicase of the helper virus.
The β-glucuronidase (GUS) and the ORF5 of the
PRRSV, a porcine virus with a high impact on
animal health (Plana-Duran et al., 1997), have
been expressed using this vector (Alonso et al.,
2001b). PRRSV ORF5 (603 nt) encodes a surface
glycoprotein that is the major PRRSV protective
antigen described (Plana-Duran et al., 1997).
Maximum expression levels of both GUS and 
PRRSV ORF5 were detected from passages 3 to 
6, although the expression of these genes persisted 
for at least 10 passages in ST cells. 

The HCoV-229E has also been used to express 
new subgenomic mRNAs although until now it 
has not been applied to the expression of a foreign 
protein (Thiel et al., 1998). It was demonstrated 
that a synthetic RNA comprised of 646 nt from 
the 5’ end and 1465 from the 3’ end was amplified 
by the helper virus. Using this minigenome, mR- 
NAs were efficiently expressed under the control 
of the intergenic region of the HCoV-229E nucle- 
ocapsid protein. 


5.2. Helper dependent expression systems based 
on group 2 coronaviruses 


Most of the work has been done with MHV 
defective RNAs (Lin and Lai, 1993; Liao et al., 
1995; Zhang et al., 1997). Three heterologous 
genes have been expressed using the MHV system, 
chloramphenicol acetyltransferase (CAT), HE, 
and interferon-γ (Fig. 3A–C). Expression of the
reporter gene (CAT) was detected only in passages 
0, 1, and 2. The HE was clearly visualized after 
immunoprecipitation only during the first three 
passages (Liao and Lai, 1995) and the synthesized 
protein was incorporated into the virions. When 
virus vectors expressing CAT and HE were inocu- 
lated intracerebrally into mice, HE- or CAT-spe- 
cific subgenomic mRNAs were detected in the 
brains at days 1 and 2 p.i. but not later, indicating
that the genes in the defective minigenome (DI) 
vector were expressed only in the early stage of 
viral infection (Zhang et al., 1998). 

A DI RNA of the MHV was also developed as
a vector for expressing interferon-γ (IFN-γ).
Murine IFN-γ was secreted into culture
medium as early as 6 h post-transfection and
reached a peak level at 12 h post-transfection. The
DI-expressed IFN-γ exhibited an antiviral activity
comparable to that of recombinant IFN-γ. No
inhibition of virus replication was detected when
the cells were treated with IFN-γ produced by the
DI RNA, but infection of susceptible mice with
DI RNA producing IFN-γ caused significantly
milder disease, accompanied by less virus replication
than that caused by virus containing a control
DI vector (Lai et al., 1997; Zhang et al.,
1997).


5.3. Helper dependent expression systems based 
on group 3 coronaviruses 


IBV is an avian coronavirus with a single- 
stranded, positive-sense RNA genome of 27 608 nt 
(Boursnell et al., 1987). A defective RNA (CD-61) 
derived from the Beaudette strain of IBV
was used as an RNA vector for the expression of
two reporter genes, luciferase and CAT (Fig. 3F)
(Penzes et al., 1994, 1996). The defective RNA
efficiently expressed the CAT gene but only minimal
levels of luciferase (Stirrups et al., 2000).

A helper dependent expression system has recently
been described based on arteriviruses (Molenkamp
et al., 2000), which belong to the same
order as coronaviruses. Using equine arteritis
virus (EAV) minigenomes of 3.8 kb, the CAT
reporter gene has been expressed. The smallest
defective RNA obtained (3.0 kb) was replicated by
the helper virus but could not be packaged.


5.4. Heterologous gene expression levels in helper 
dependent expression systems 


The expression levels have not been quantified
in terms of protein mass for MHV helper dependent
expression systems. HCoV-229E mRNA expression
levels using engineered expression
modules without a heterologous gene are probably
high, since the abundance of these RNAs seems
to be higher than that of the viral mRNAs within
the same cells. Using IBV minigenomes, CAT
expression levels between 1 and 2 µg/10⁶ cells have
been described.
The highest expression levels (2–8 µg of GUS
per 10⁶ cells) have been obtained using a two step
amplification system based on TGEV derived
minigenomes with optimized TRSs (Izeta et al.,
1999; Alonso et al., 2001a).


6. Single genome coronavirus vectors 


6.1. Vectors constructed by targeted 
recombination 


Reverse genetics has been made possible by targeted
recombination between a helper virus and either
non-replicative or replicative coronavirus derived 
RNAs. This approach was initially developed by 
Masters’ group (Masters, 1999). First, the engi- 
neering of a five nucleotide insertion into the 3’ 
untranslated region (3' UTR) of MHV via 
targeted recombination with an in vitro synthe- 
sized RNA was reported (Fig. 4A) (Koetzner et 
al., 1992). This approach was facilitated by the 
availability of an N gene mutant, designated 
Alb4, that was both temperature sensitive and 
thermolabile. Alb4 forms tiny plaques at restric- 
tive temperature that are easily distinguishable 
from wild-type plaques. In addition, incubation of
Alb4 virions at non-permissive temperature results
in a 100-fold greater loss of titer than for
wild-type virions (Koetzner et al., 1992). These
phenotypic traits allowed the selection of recom- 
binant viruses generated by a single cross-over 
event following cotransfection into mouse cells of 
Alb4 genomic RNA together with a synthetic 
copy of the smallest subgenomic RNA (RNA7) 
tagged with a marker in the 3’ UTR. 

An improvement of the recombination frequency
was obtained between the helper virus and
replicative defective RNAs as the donor species.
Whereas a recombination frequency of the order
of 10⁻⁵ was estimated between replication competent
MHV and non-replicative RNAs, the
use of replicative donor RNA yielded recombinants
at a rate some three orders of magnitude
higher (van der Most et al., 1992). This higher
efficiency made it possible to screen for recombinants
even in the absence of selection. In this
manner, the transfer of a silent mutation in gene 1a
of a minigenome to wild-type MHV at a frequency
of about 1% was demonstrated.
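
The arithmetic behind screening without selection can be sketched as follows. The sketch assumes a baseline frequency of roughly 10⁻⁵ for non-replicative donor RNA, a value consistent with the three-orders-of-magnitude gain and the frequency of about 1% quoted above. The screening estimate is a back-of-envelope addition that treats each plaque as an independent draw; it is not a calculation taken from the cited work, and the function names are illustrative.

```python
import math

# Assumed values for illustration: a baseline of about 1e-5 recombinants per
# progeny virus with a non-replicative donor RNA, and a gain of three orders
# of magnitude with a replicative donor RNA.
BASELINE_FREQUENCY = 1e-5
GAIN_ORDERS_OF_MAGNITUDE = 3


def replicative_donor_frequency(baseline: float, orders: int) -> float:
    """Scale a baseline recombination frequency by a number of orders of magnitude."""
    return baseline * 10 ** orders


def plaques_to_screen(frequency: float, confidence: float = 0.95) -> int:
    """Plaques needed to find at least one recombinant with the given probability.

    Assumes every plaque is an independent draw with the same chance of being
    a recombinant, so P(no recombinant in n plaques) = (1 - frequency) ** n.
    """
    return math.ceil(math.log(1 - confidence) / math.log(1 - frequency))


improved = replicative_donor_frequency(BASELINE_FREQUENCY, GAIN_ORDERS_OF_MAGNITUDE)
print(f"replicative donor frequency: {improved:.0%}")
print("plaques to screen, non-replicative donor:", plaques_to_screen(BASELINE_FREQUENCY))
print("plaques to screen, replicative donor:", plaques_to_screen(improved))
```

Under these assumptions a few hundred plaques suffice with a replicative donor, against several hundred thousand with a non-replicative one, which is the practical difference between screening and requiring a selection.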

Targeted recombination has been applied to the
generation of mutants in most of the coronavirus
genes. For instance, two silent mutations have been
created thus far in gene 1 (van der Most et al., 1992).
The S protein has also been modified by targeted
recombination. Changes were introduced by one
crossover event at the 5’ end of the S gene that
modified MHV pathogenicity (Leparc-Goffart et
al., 1998). Targeted recombination mediated by
two cross-overs allowed the replacement of the S
gene of a respiratory strain of TGEV by the S
gene of enteric TGEV strain PUR-C11 leading to
the isolation of viruses with a modified tropism
and virulence (Sanchez et al., 1999). In this case,
the recombinants were selected in vivo using their
new tropism in piglets. A new strategy for the
selection of recombinants within the S gene, after
promoting targeted recombination, was based on
elimination of the parental replicative TGEV by
the simultaneous neutralization with two mAbs
(Fig. 4B) (Sola et al., 2001b).

Fig. 4. Single genome expression based on the engineering of coronavirus minigenomes by targeted recombination: A. Basic scheme
of targeted recombination in MHV. The black box indicates the approximate location of the N gene region (87 nt) that is deleted
in the Alb4 mutant. M, insertion of 5 nt used as a genetic marker (Masters, 1999). B. Targeted recombination within the S gene of
TGEV and a minigenome carrying the information for an S gene with three nucleotide mutations (Sdmar) that allow escape from
neutralization by two mAbs specific for antigenic sub-sites Ac and Aa of S protein (Sola et al., 2001b).

Mutations have been created by targeted muta- 
genesis within the E and M genes. These mutants 
provided corroboration for the pivotal role of E 
protein in coronavirus assembly and identified the 
carboxyl terminus of the M molecule as crucial to 
assembly (de Haan et al., 1998; Fisher and Goff, 
1998). Targeted recombination was also used to 
express heterologous genes. For instance, the gene 
encoding green fluorescent protein (GFP) was inserted
into MHV between genes S and E by
targeted recombination, resulting in the creation 
of the largest known RNA viral genome (Fischer 
et al., 1997). 

The frequencies of the targeted recombination 
event for MHV and TGEV were found to be 
higher than the standard prediction for the recom- 
bination frequency of a multiple crossover. This 
frequency was expected to be the product of the 
frequencies of the individual recombination events 
(Peng et al., 1995; de Haan et al., 1998; Hsue and 
Masters, 1998; Masters, 1999; Sola et al., 2001b). 
Nevertheless, the recombinants with several 
crossovers appear to occur more frequently than 
would be expected if each cross-over was an inde- 
pendent event. This suggests that the alignment of 
two templates is the rate-limiting event in recom- 
bination, and once this has been achieved, the 
barrier to multiple crossovers may be only mar- 
ginally higher than that for single crossovers 
(Masters, 1999; Sola et al., 2001b). 
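
The independence argument can be made concrete with a small sketch. If crossovers were independent events, the expected frequency of a multiple crossover would be the product of the single-crossover frequencies; the fold excess of the observed frequency over that product measures the departure from independence. The numbers below are hypothetical and chosen only to show the calculation; they are not measurements from the cited studies.

```python
from math import prod


def expected_if_independent(single_crossover_frequencies: list[float]) -> float:
    """Expected multiple-crossover frequency if every crossover were independent."""
    return prod(single_crossover_frequencies)


def fold_excess(observed: float, single_crossover_frequencies: list[float]) -> float:
    """How many times more frequent the observed event is than the independent expectation."""
    return observed / expected_if_independent(single_crossover_frequencies)


# Hypothetical values for illustration only.
singles = [1e-2, 1e-2]      # two single-crossover frequencies
observed_double = 1e-3      # observed double-crossover frequency

print("expected under independence:", expected_if_independent(singles))
print("fold excess over expectation:", fold_excess(observed_double, singles))
```

A fold excess well above 1, as in this invented example, is what would be expected if template alignment rather than each individual crossover were the rate-limiting step.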


6.2. Coronavirus vectors derived from infectious 
cDNA clones 


The construction of a full-length genomic 
cDNA clone could considerably improve the ge- 
netic manipulation of coronaviruses. Infectious 
cDNA clones have now been constructed for 
members of many positive-stranded RNA virus 
families (Racaniello and Baltimore, 1981;
Ahlquist et al., 1984; Rice et al., 1987, 1989;
Liljeström and Garoff, 1991; Satyanarayana et
al., 1999), including the Arteriviridae family 
closely related to coronaviruses (van Dinten et al., 
1997; Meulenberg et al., 1998; de Vries et al., 
2000). Negative-stranded RNA virus genomes 
have been generated for Mononegavirales by the 
simultaneous expression of the ribonucleoprotein 
containing the N protein, the polymerase cofactor 
phosphoprotein and the viral RNA polymerase 
(Schnell et al., 1994). Rescue of engineered RNAs 
in negative-strand RNA virus with eight genome 
segments was also possible for influenza virus 
(Palese, 1998; Fodor et al., 1999; Neumann et al., 
1999; Hoffmann et al., 2000a,b). 

The enormous length of the coronavirus 
genome and the instability of plasmids carrying 
coronavirus replicase sequences have, until re- 
cently, hampered the construction of a full-length 
cDNA clone (Masters, 1999). Now, for the first 
time, construction of infectious coronavirus 
cDNA clones is possible (Almazan et al., 2000; 
Yount et al., 2000; Thiel et al., 2001). Construc- 
tion of the TGEV full-length cDNA was started 
from a DI that was stably and efficiently repli- 
cated by the helper virus (Méndez et al., 1996; 
Izeta et al., 1999). Using this DI, the full-length 
genome was completed and the performance of 
the enlarged genome was checked after each step. 
This approach allowed for the identification of a 
cDNA fragment that was toxic to the bacterial 
host. This finding was used to advantage by 
reintroducing the toxic fragment into the cDNA 
in the last cloning step. In order to express the 
long coronavirus genome and to add the 5’ cap, a 
two-step amplification system that couples tran- 
scription in the nucleus from the CMV promoter, 
with a second amplification in the cytoplasm 
driven by the viral polymerase, was used. In addition,
to increase viral cDNA stability within bacteria,
the cDNA was cloned as a bacterial artificial
chromosome (BAC), which is maintained at only one,
or at most two, plasmid copies per cell. BACs have
been useful to stably clone large DNAs from a 
variety of complex genomic sources into bacteria 
(Shizuya et al., 1992), including herpesvirus DNA 
(Messerle et al., 1997). 

A fully functional infectious TGEV cDNA
clone (pBAC-TGEV^FL), leading to a virulent virus
able to infect both the enteric and respiratory
tracts, has been engineered using two BAC plasmids
(Fig. 5A). One plasmid (pBAC-TGEV^ΔClaI)
contained all virus sequences except for a fragment
of about 5 kb that was included within a
second BAC (pBAC-TGEV^ClaI) (Almazan et al.,
2000). Using this cDNA the GFP gene of 0.72 kb
was cloned into the RNA genome by replacing
the non-essential 3a and 3b genes (Fig. 5B), leading
to an engineered genome with high expression
levels (> 20 µg/10⁶ cells) and stability (> 20 passages
in cultured cells) (Sola et al., 2001a). Using
the TGEV derived cDNA expression system, the
induction of lactogenic immunity in swine has
been demonstrated. This immune response led to
the acquisition of immunity by newborn piglets (I.
Sola and L. Enjuanes, unpublished results).

These expression levels are similar to those
described for vectors based on other positive-strand
RNA viruses such as poliovirus and cardiovirus,
and one alphavirus, VEEV (4 µg/10^6 cells).
Nevertheless, these expression levels are
still lower than those described for other alphaviruses
such as Sindbis virus (50 µg/10^6 cells)
(Frolov et al., 1996; Agapov et al., 1998) and SFV
(80-300 µg/10^6 cells) (Liljeström and Garoff,
1991; Sjoberg et al., 1994; DiCiommo and Bremner,
1998).

DNA-based expression systems in general produce
high levels of the foreign protein. For instance,
vectors based on adenovirus 5 using the
major late promoter and a tripartite
leader may express 90 µg/10^6 cells, and baculovirus
expression systems 15-100 µg/10^6 cells (Kuroda
et al., 1989; Sibilia et al., 1995).

A second procedure to assemble a full-length
infectious construct of TGEV was based on the in
vitro ligation of six adjoining cDNA subclones
that span the entire TGEV genome. Each clone
was engineered with unique flanking interconnecting
junctions that permit assembly
only with the adjacent cDNA subclones, resulting
in a full-length TGEV cDNA. In vitro transcripts derived
from the full-length TGEV construct were infectious
(Yount et al., 2000).

Fig. 5. Cloning of the TGEV cDNA in BACs and expression
of GFP: A. Plasmid pBAC-TGEV^FL (bottom plasmid) was
generated using two plasmids, one containing all the virus
genome (top plasmid) except a sequence of about 5 kb present
between two Cla I sites, which was cloned in a second plasmid
(middle plasmid). CMV, cytomegalovirus immediate-early promoter;
Poly(A), tail of 24 A residues; HDV, hepatitis delta virus
ribozyme; BGH, bovine growth hormone termination and
polyadenylation sequences; SC11, S gene of PUR-C11 strain.
B. Expression of GFP using an infectious TGEV cDNA clone.
Genes 3a and 3b were deleted in the TGEV infectious cDNA,
cloned in BAC, leading to a replication competent cDNA
(pBAC-TGEV-Δ3ab-GFP). GFP gene (0.72 kb) was inserted
within the position of the deleted genes after the TRS of gene
3a. GFP, green fluorescent protein; SC11, S gene of PUR-C11
TGEV strain; An, poly A; HDV, hepatitis delta virus ribozyme;
BGH, bovine growth hormone termination and
polyadenylation signals.

More recently, an infectious cDNA clone of
HCoV-229E has been reported (Thiel et al., 2001).
This system is based upon the in vitro transcription
of infectious RNA from a cDNA copy of the
human coronavirus 229E genome that has been
cloned and propagated in vaccinia virus. In this
case, as when BACs were used to assemble the
coronavirus infectious cDNA, the full-length
coronavirus genome can be modified and propagated
as a single cDNA molecule, in contrast to the
infectious cDNA clone derived from six fragments
that are ligated in vitro.

The engineered cDNAs will have an important 
impact on the study of mechanisms of coronavirus 
replication and transcription and provide an in- 
valuable tool for the experimental investigation of 
virus-host interactions. These cDNAs may also be 
the basis for tissue-specific expression systems that 
may be used in human, porcine, canine and feline 
species by replacing the S gene included in the 
cDNA with that of the coronavirus infecting the 
target species. 


6.3. Cloning capacity of coronavirus expression 
vectors 


Coronavirus minigenomes have a theoretical
cloning capacity close to 27 kb, since their RNA,
with a size of about 3 kb, is efficiently amplified
and packaged by the helper virus and the virus
genome is about 30 kb long. In contrast, the theoretical
cloning capacity for an expression system
based on a single coronavirus genome like TGEV
may be 3.1 kb, taking into account that: (i) the
non-essential 3a and 3b genes (1.0 kb) have been
deleted; (ii) the standard S gene can be replaced by
that of PRCV mutants with a deletion of 0.67 kb;
and (iii) both DNA and RNA viruses may accept
genomes with sizes up to 105% of the wild-type
genome (Bett et al., 1993; Parks and Graham,
1997; Afanasiev et al., 1999). This cloning capacity
will most likely be enlarged in the near future
if non-essential domains of the replicase gene,
which spans more than 20 kb, are identified. The
present cloning capacity of the coronavirus vectors
(around 3 kb) is within the range expected,
since other RNA virus vectors, such as those
derived from VEEV, with a genome of around
12 kb, accept inserts of about 1 kb in size that are
expressed during ten serial tissue culture passages.
Larger genes in the order of 2.5 kb can also be
expressed by the vector but are not tolerated as
well by these viruses (Caley et al., 1997). Accordingly,
in the SFV and Sindbis virus expression systems,
recombinants with smaller inserts (e.g. <2 kb)
tend to be more stable than those with larger
inserts, such as the β-galactosidase gene (about 3
kb) (Bredenbeek and Rice, 1992).
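
The capacity estimates given at the start of this section can be reproduced with simple arithmetic. The following Python sketch is only an illustration: the function and constant names are hypothetical, and the TGEV genome length of 28.5 kb is an assumed value (the text states only "about 30 kb"), chosen because it reproduces the 3.1 kb figure.

```python
# Back-of-envelope check of the cloning-capacity figures quoted in the text.
# All lengths are in kb. Names are illustrative only.

ROUNDED_GENOME_KB = 30.0   # "about 30 kb" (from the text)
TGEV_GENOME_KB = 28.5      # ASSUMPTION: TGEV genome length, not stated in the text
MINIGENOME_KB = 3.0        # efficiently packaged minigenome (from the text)
DELETED_3AB_KB = 1.0       # non-essential genes 3a and 3b (from the text)
S_DELETION_KB = 0.67       # deletion in the S gene of PRCV mutants (from the text)
OVERSIZE_FRACTION = 0.05   # genomes up to 105% of wild type are accepted


def minigenome_capacity(genome_kb, minigenome_kb):
    """Room left when a minigenome is extended up to full genome length."""
    return genome_kb - minigenome_kb


def single_genome_capacity(genome_kb, deletions_kb, oversize_fraction):
    """Room freed by deletions plus the tolerated oversize margin."""
    return sum(deletions_kb) + oversize_fraction * genome_kb


helper_dependent = minigenome_capacity(ROUNDED_GENOME_KB, MINIGENOME_KB)
single_genome = single_genome_capacity(
    TGEV_GENOME_KB, [DELETED_3AB_KB, S_DELETION_KB], OVERSIZE_FRACTION
)

print(f"helper dependent system: ~{helper_dependent:.0f} kb")
print(f"single genome system:    ~{single_genome:.1f} kb")
```

With the rounded 30 kb genome the oversize margin alone would be 1.5 kb instead of about 1.4 kb, so the single genome estimate is not very sensitive to the assumed genome length.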

The longest cassette expressed using the Sindbis 
virus was 3.2 kb in length, which is remarkable for 
such a small vector (Sindbis genome is roughly 
11.8 kb in length) and indicates that, apparently, 
the Sindbis virion is a rather flexible structure 
capable of accepting RNA molecules which are at 
least 30% larger than its own genome. Neverthe- 
less, it should be noted that after the third pas- 
sage, barely detectable amounts of the expression 
products were observed in cells infected with all 
recombinants and after the fifth passage, no het- 
erologous proteins were detected in infected cells 
(Pugachev et al., 1995). 

In VEEV, as well as in the Sindbis system, 
larger genes have been shown to retard virus 
growth (Pugachev et al., 1995). The type of gene
being expressed may also play an important role
in limiting replication; for example, VEEV vectors
expressing glycoproteins typically do not replicate
to titers as high as those expressing cytoplasmic
proteins (Caley et al., 1999).


7. Regulation of transcription 


Most of the information on coronavirus tran- 
scription has been generated using a helper depen- 
dent expression system based on minigenomes 
encoding new subgenomic mRNAs (Makino et 
al., 1991; van der Most et al., 1994; van der Most 
and Spaan, 1995; Krishnan et al., 1996; Penzes et 
al., 1996; Lai and Cavanagh, 1997; Izeta et al., 
1999). 

Coronavirus mRNAs have a leader sequence of 
65-98 nucleotides at their 5’ ends, which is 
derived from the 5’ end of the genomic RNA. At 
the start site of every transcription unit on the 
viral genomic RNA, there is a CS that is nearly 
homologous to the 3’ end of the leader RNA. This 
sequence constitutes part of the signal for subgenomic
mRNA transcription.

Coronavirus RNA synthesis occurs in the cytoplasm
via a negative-strand RNA intermediate
which contains short stretches of oligo(U) at the 
5’ end. Both genome-size and subgenomic nega- 
tive-strand RNAs, which correspond in number 
of species and size to those of the virus-specific 
mRNAs have been detected (Fig. 2B). The subge- 
nomic negative-strand RNA sequences appear to 
be complementary to the positive-strand subge- 
nomic mRNAs. 

The common 5’ leader sequence is only found 
at the very 5’ terminus of the genome, which 
implies that the synthesis of subgenomic mRNAs 
involves fusion of non-contiguous sequences 
(Baric et al., 1983; Spaan et al., 1983; Lai et al., 
1984). Two models compatible with most of the
experimental data have been proposed to explain the
synthesis of leader-containing subgenomic mRNAs:
leader-primed transcription (Lai, 1998) and
discontinuous transcription during negative-strand
RNA synthesis (Sethna et al., 1989; Sawicki
and Sawicki, 1990; van Marle et al., 1999).
The leader-primed transcription
model proposes that the virion genomic
RNA is first transcribed into a genomic-length 
negative-strand RNA which, in turn, becomes the 
template for subsequent subgenomic mRNA syn- 
thesis. The leader is transcribed from the 3’ end of 
the negative-strand genomic RNA and dissociates 
from the template to subsequently associate with 
the template RNA at the various mRNA start 
sites serving as a primer for the transcription of 
the viral subgenomic mRNAs. It is proposed that 
the discontinuous transcription step takes place 
during positive-strand RNA synthesis. The discontinuous
transcription during negative-strand
RNA synthesis model (Fig. 6A-B) proposes that
the discontinuous transcription step occurs during
negative-strand RNA synthesis, generating subgenomic
negative-strand RNAs, which then serve as
templates for the uninterrupted synthesis of
subgenomic mRNAs. In this model, at the TRS on the
genomic RNA, the nascent subgenomic negative-strand
RNA jumps to the leader RNA sequence
at the 5’ end of the genomic RNA to act as a
primer for transcription, in a process that may be
helped by cellular and viral proteins (Fig. 6B).


These models are not mutually exclusive, as
components of each model may operate at different
stages of the viral replication cycle. Nevertheless,
recent experimental evidence increasingly
supports the second model (van Marle et al., 1999;
Baric and Yount, 2000).


7.1. Transcription regulatory sequences 


The TRSs, including the CS regions, are short
sequence elements upstream of the transcription
units. Because the leader-mRNA junction occurs
within the CS or its minus-sense counterpart
(cCS), the CSs are considered to be crucial
for mRNA synthesis. The sequence of the cCS
probably influences transcription through at
least two types of recognition events: (i) one related
to the potential basepairing between the cCS and the
leader 3’ end, which is complementary to it, that
guides the fusion between the leader and the body
of the mRNA (Fig. 6A-B); and (ii) the recognition
of the primary or secondary structure within
the neighborhood of the cCS, which may cause
formation of a bridge between the CS and the 3’
end of the leader, mediated by protein-protein
and protein-RNA interactions (Lai, 1998). According
to this model, the cCS should act as a
classical promoter where transcription is initiated.
Alternatively, this sequence may slow down or
even detach the transcriptase complex, according
to the discontinuous transcription during negative-strand
RNA synthesis model (van der Most and
Spaan, 1995; Sawicki and Sawicki, 1998; van
Marle et al., 1999).

Fig. 6. Schematic representation of a coronavirus discontinuous
transcription model and the RNA structures involved:
A-B. Discontinuous transcription during negative-strand
RNA synthesis without (A) and with (B) a schematic representation
of protein-RNA complexes potentially involved. During
the negative-strand RNA synthesis (discontinuous line) the
replication complex is detached when the CS sequence is
reached, and the complex joins the 3’ end of the leader.

Fig. 7. CSs of coronavirus TRSs: A. Alignment of representative CSs of the three groups of coronaviruses and one arterivirus. B.
Relative abundance of the mRNAs produced after the mutagenesis of CSs of different lengths (10, 17 and 18 nt) using the
MHV-A59 strain derived minigenomes (modified after van der Most et al., 1994).

The CSs of coronaviruses belonging to groups I
(hexameric 5’-CUAAAC-3’) and II (heptameric
5’-UCUAAAC-3’) share homology, whereas the
CS of coronaviruses belonging to group III, like
that of IBV, has the most divergent sequence
(5’-CUUAACAA-3’). Also, arterivirus CSs have a
sequence (UCAAC) that partially resembles that
of IBV. Thus, the CSs of different coronaviruses
are quite similar, though different in length (Fig.
7A).


7.2. The extent of the basepairing could in part 
determine mRNA levels 


The potential basepairing between the 3’ end of
the leader and the cCS differs slightly among the
different genes of coronaviruses. For MHV, the
extent of the basepairing ranges from 9 to 18 bp.
Every MHV cCS contains the sequence 3’-AGAUUUG-5’,
or a closely related sequence (van der
Most and Spaan, 1995). Cloning short oligonucleotides
ranging from 10 to 18 nt, comprising the CS
sequences, in MHV minigenomes showed that
these sequences alone were sufficient to direct
subgenomic DI RNA synthesis (van der Most et
al., 1994). A series of deletions in the sequences
flanking the cCS reduced mRNA production,
demonstrating that the sequence flanking the CS
3’-AGAUUUG-5’ affected the efficiency of subgenomic
DI RNA transcription (Makino et al., 1991;
Joo and Makino, 1992; Makino and Joo, 1993;
van der Most et al., 1994). TRS activity
became optimal in MHV when a sequence of 18 nt
from a region showing full complementarity to the
3’ end of the MHV genomic leader was used (Shieh et
al., 1987; Makino et al., 1988; La Monica et al.,
1992).
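
The relation between a CS and its minus-sense counterpart (cCS) discussed above can be made explicit with a short Python sketch. It is an illustration only: the function name and the dictionary labels are hypothetical, and the two core sequences are those quoted in Section 7.1 for group I and group II coronaviruses. The cCS is written 3’ to 5’ so that it aligns base by base under the 5’ to 3’ CS.

```python
# Watson-Crick complement of an RNA core sequence (CS), written 3'->5'
# so that each base sits directly under its partner in the 5'->3' CS.

RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def complement_3to5(cs_5to3):
    """Return the complementary strand read 3'->5' (no reversal applied)."""
    return "".join(RNA_COMPLEMENT[base] for base in cs_5to3.upper())


# Core sequences as quoted in the text; the labels are illustrative.
CORE_SEQUENCES = {
    "group I": "CUAAAC",
    "group II": "UCUAAAC",
}

for label, cs in CORE_SEQUENCES.items():
    print(f"{label}: 5'-{cs}-3' pairs with 3'-{complement_3to5(cs)}-5'")

# The group I hexamer is contained within the group II heptamer.
hexamer_in_heptamer = CORE_SEQUENCES["group I"] in CORE_SEQUENCES["group II"]
print("group I CS contained in group II CS:", hexamer_in_heptamer)
```

For the group II heptamer this yields the cCS sequence 3’-AGAUUUG-5’ given above for MHV.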

In MHV, TRS strength is affected only slightly
when a single nucleotide is mutated (Fig. 7B) (Joo
and Makino, 1992; van der Most et al., 1994).
Exceptionally, substitutions in some positions result
in a more than 10-fold reduction of transcription.
An increase in the TRS length, which leads to
an increase in the potential hybridization with the
3’ end of the leader, allows the introduction of
more than one mutation without a significant
decrease in mRNA production (Fig. 7B). In
TGEV, the presence of the CS 5’-CUAAAC-3’ led
to mRNA transcription, but deletion of the
‘U’ or a change in the second ‘C’ led to the
complete abrogation of mRNA transcription
(Alonso et al., 2001a). These data suggest that
transcription initiation requires a duplex of a certain
minimum stability. Once this condition is met,
extending the basepairing does not increase TRS
strength.

Using TGEV derived RNA minigenomes, we
have shown that the CS sequence 5’-CUAAAC-3’
is required and sufficient for high expression
levels, provided that it is in the appropriate
context (Alonso et al., 2001a). Nevertheless, the
mRNA and protein expression levels are highly
influenced by the sequences flanking the 5’ and
3’ sides of this CS. The addition of 5’
upstream sequences from the TGEV N gene to
the CS (from 6 to 94 nt) led to an increase in
total mRNA transcription of up to 4-fold. The
sequences 3’ downstream of the CS also led to
4-fold higher mRNA levels in TGEV (Alonso et
al., 2001a). Similarly, in IBV, expression of the
reporter gene was under the control of the canonical
octameric IBV CS sequence CU(U/G)AACAA (Stirrups et
al., 2000).

In arteriviruses, it has been shown that discontinuous
transcription takes place during the synthesis
of the negative RNA strand (van Marle et
al., 1999). Using site-directed mutagenesis of an
infectious cDNA clone of EAV, it has been
shown that subgenomic mRNA (sgmRNA) synthesis
requires a basepairing interaction between the
leader TRS and the complement of a body TRS
in the viral negative strand (Fig. 6). The EAV TRS
core consists of the CS pentanucleotide 5’-UCAAC-3’.
mRNA synthesis thus appears to be governed by a direct
basepairing interaction between the plus-strand leader
TRS and the complement of the body TRS. Using TRS
mutants with reduced transcriptional activity, evidence
was obtained showing that the TRS sequence
at the leader-body junction of the sgmRNA
is derived from the body TRS. This finding supports
the idea that sgmRNAs are generated by a
mechanism of discontinuous minus-strand synthesis
(van Marle et al., 1999). The sequences at the
3’ end of the leader complementary to the TRS
core are part of a single-stranded RNA loop, which
may facilitate their interaction with the cCS. In TGEV,
the equivalent sequences are also predicted to be
mostly within a single-stranded RNA loop at the 3’
end of the leader.


7.3. Effect of core sequence copy number on 
transcription 


Coronavirus transcription has also been studied
using more than one CS to
express the same mRNA. Using BCoV defective
RNAs with one to three copies of the heptameric canonical
TRS ‘UCUAAAC’, separated by 20 nt in the
tandem repeats, it was observed that, although
transcription initiation occurred at each of the
three CS sites in the tandem construct, almost all
the transcripts were a product of the
most downstream CS (Krishnan et al., 1996).
Nevertheless, the accumulated amounts of subgenomic
mRNA remained nearly the same for the
three constructs with one to three CSs because the
minigenome with three CSs was replicated with
lower efficiency than those with two or one CS
copy. Similarly in IBV, expression of CAT under
the control of TRSs composed of two tandem
repeats of the canonical octameric IBV CS sequence
‘CUUAACAA’ showed that either CS
can function as an acceptor site for
mRNA synthesis, but transcription preferentially
occurred at the 3’-most TRS (Stirrups et al.,
2000).

In other cases, several copies of the CS were 
inserted within the same minigenome to express 
more than one RNA. Insertion of two CS copies 
within an MHV defective RNA resulted in the 
decrease of the 5’ upstream mRNA if the CSs 
were separated by 124 nt (Joo and Makino, 1995). 
In another study (van Marle et al., 1995), in
which combinations of up to three CSs separated
by 361-761 nt were inserted, it was shown that
the position of the CS affected the amounts of
mRNAs produced, in spite of the large distance
between CSs.

Both sets of experiments showed that downstream
CSs have a negative effect on transcription from
upstream CSs. The observation that the most
3’ CS is preferentially used is consistent with the 
coronavirus discontinuous transcription during 
the negative-strand synthesis model (Sawicki and 
Sawicki, 1990). However, this model does not 
explain all the observations made in coronavirus 
transcription. In FIPV and TGEV, the shortest 
mRNAs are produced in lower quantities than the 
next larger mRNA encoding the nucleocapsid 
protein (de Groot et al., 1987b; Sethna et al., 
1989; Penzes et al., 2001). This result strongly 
suggests that mRNA abundance is influenced by 
the presence of additional regulatory signals. 


7.4. Expression using an internal ribosomal entry
site


Expression driven by an internal ribosomal entry
site (IRES) in coronaviruses has
been documented for mRNA 3 of IBV (Liu and
Inglis, 1992; Le et al., 1995) and for mRNA 5 of
MHV (Leibowitz et al., 1988; Thiel and Siddell,
1994). CAT expression has been shown using the
IRES sequence of
the encephalomyocarditis virus in the MHV system
(Lin and Lai, 1993) (Fig. 3). CAT activity
20-fold higher than that obtained with control plasmids
was detected in passages 0 and 1, but decreased in
passage 2. Using the Sindbis virus, CAT expression
levels with the IRES of encephalomyocarditis
virus are about 5-fold lower than when expression
is driven by a second subgenomic promoter (Bredenbeek
and Rice, 1992). The combination of
coronavirus TRSs and IRESs could be useful for
the construction of bicistronic vectors.


7.5. Expression system stability and insert size 


Expression from MHV defective RNAs of
CAT, HE and murine IFN-γ genes was not observed
beyond passages 2, 3 and 4, respectively.
Using minigenomes derived from TGEV and IBV,
expression was more stable but highly dependent
on the nature of the heterologous gene used.
Luciferase expression with TGEV and IBV
minigenomes was reduced to almost background
levels, while expression of GUS or CAT using
TGEV or IBV derived minigenomes, respectively,
was observed for about 10 passages (Izeta et al.,
1999; Stirrups et al., 2000).

The expression of GUS or PRRSV ORF5 using
TGEV minigenomes increased until passage
three, leading to a single new mRNA corresponding
to the GUS or the ORF5 mRNA (Izeta et al.,
1999; Alonso et al., 2001b). Expression levels were
maintained for 8 passages, but new mRNA bands
smaller than the full-length GUS
or ORF5 mRNA were observed at passage 5,
steadily decreasing during successive passages
(Alonso et al., 2001a,b).

In general, the insertion of a heterologous gene 
such as GUS into TGEV derived minigenomes led 
to a 40—50-fold reduction in the levels of the 
minigenome RNA (Alonso et al., 2001a). The 
limited stability of the helper dependent expres- 
sion systems is most likely due to the foreign gene 
since TGEV minigenomes of 9.7, 3.9 and 3.3 kb, 
in the absence of the heterologous gene, are am- 
plified and efficiently packaged for at least 30 
passages, without generating new dominant 
subgenomic RNAs (Méndez et al., 1996; Izeta et 
al., 1999). The recombination frequency in MHV, 
TGEV, and IBV seems inversely proportional to 
the stability of the recombinants expressing a 
foreign gene. In fact, the MHV based helper 
dependent expression system has the lowest stabil- 
ity probably because of the higher recombination 
frequency within this system (Lai, 1996). 

The stability of the expression system is also
conditioned by the type of polymerases involved
in amplification of the minigenome and transcription
of the mRNA (Agapov et al., 1998). The
frequency of mutations accumulated during the in vitro
synthesis of minigenome RNAs with T7 DNA-dependent
RNA polymerase is 10^-4 to 10^-6, consisting
mostly of 1-nt insertions and deletions
(Boyer et al., 1992; Sooknanan et al., 1994). After
transfection of the in vitro produced RNA, synthesis
of mRNA by the viral RNA-dependent
RNA polymerase should accumulate
mutations at a relatively higher frequency of
10^-3 to 10^-4 (Ward et al., 1988; de Mercoyrol et
al., 1992). An improvement in expression stability
should be observed by using expression systems
initiated by DNA transfection, such as
those based on the expression of the
minigenomes under the CMV promoter, since in
these cases a eukaryotic RNA polymerase II
expresses the minigenome with an estimated error
frequency of 5 x 10^-6 on synthetic polynucleotide
substrates (de Mercoyrol et al., 1992).
In addition, the eukaryotic RNA polymerase II
has additional mechanisms to ensure even more
accurate transcription (Thomas et al., 1998).
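
To give a feeling for what these error frequencies mean for a minigenome, the following Python sketch estimates the expected number of errors per RNA copy. It is a rough illustration under stated assumptions: the quoted frequencies are read as per-nucleotide rates, errors are treated as independent, and the function names are hypothetical. The lengths are those of the TGEV minigenomes mentioned above (3.3, 3.9 and 9.7 kb), and the rates are the two ends of the range quoted for the viral RNA-dependent RNA polymerase.

```python
# Expected copying errors per RNA molecule.
# ASSUMPTIONS: the quoted frequencies are per-nucleotide rates and each
# nucleotide is miscopied independently of the others.


def expected_errors(rate_per_nt, length_nt):
    """Mean number of errors in one copy of an RNA of the given length."""
    return rate_per_nt * length_nt


def fraction_error_free(rate_per_nt, length_nt):
    """Probability that one copy carries no error at all."""
    return (1.0 - rate_per_nt) ** length_nt


MINIGENOME_LENGTHS_NT = [3300, 3900, 9700]  # TGEV minigenomes cited in the text
RDRP_RATES = [1e-4, 1e-3]                   # viral polymerase range from the text

for length in MINIGENOME_LENGTHS_NT:
    for rate in RDRP_RATES:
        print(
            f"{length} nt at {rate:.0e} per nt: "
            f"{expected_errors(rate, length):.2f} expected errors, "
            f"{fraction_error_free(rate, length):.1%} error-free copies"
        )
```

Under these assumptions the expected number of errors scales linearly with length, so lowering the per-nucleotide rate of the initiating polymerase matters more for longer constructs.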


8. Modification of coronavirus tropism 


Driving vector expression to different tissues
may be highly convenient in order to preferentially
induce a specific type of immune response,
i.e., mucosal immunity by targeting the expression
to gut-associated lymph nodes. In addition,
it seems useful to change the species-specificity 
of the vector to expand its use. Both tissue and 
species-specificity have been modified using 
coronavirus genomes. 

Group 1 coronaviruses attach to host cells
through the S glycoprotein (Holmes and Lai,
1996) by interactions with aminopeptidase N
(APN), which is the cellular receptor (Delmas et
al., 1992; Yeager et al., 1992; Tresnan et al.,
1996; Benbacer et al., 1997; Kolb et al., 1997).
Interestingly, while porcine and human
aminopeptidases show species-specificity, the feline
aminopeptidase seems to serve as a receptor
for feline, canine, porcine and human coronaviruses
(Tresnan et al., 1996; Benbacer et al., 1997). Group 2
coronaviruses use the carcinoembryonic antigen-related
cell adhesion molecules (CEACAM) as
receptors (Beauchemin et al., 1999). The S
protein is also responsible for cell entry of
group 2 coronaviruses such as MHV. In these
viruses, the S glycoprotein is cleaved into two
non-covalently associated 90-kDa subunits, the
amino-terminal S1 and the carboxy-terminal S2
subunits (Frana et al., 1985; Luytjes et al.,
1987). It is believed that the S1 subunit forms
the globular head and the S2 subunit the
membrane-bound stalk portion (de Groot et
al., 1987a). A receptor binding activity has been
demonstrated in studies using recombinant
protein containing the amino-terminal 330
residues of the S1 subunit of MHV-JHM (Kubo
et al., 1994).

Engineering the S gene can lead to changes
both in the tissue (Ballesteros et al., 1997; Leparc-Goffart
et al., 1998; Sanchez et al., 1999)
and species-specificity (Kuo et al., 2000). TGEV
enteric or respiratory tropism is conditioned by 
the primary structure of the S gene (Ballesteros 
et al., 1997). The S glycoprotein domain recog- 
nized by the cellular receptor (pAPN) on ST 
cells is located within the globular domain of 
the protein encoded between nucleotides 1518 
and 2184. This domain is present both in enteric 
and respiratory porcine coronaviruses, indicating 
that its presence in a virus is not in itself suffi- 
cient to infect the enteric tract. In fact, it has 
been demonstrated that a second factor mapping 
in the S gene around nucleotide 655 drastically 
influences the enteric tropism of the PUR46 
strain of TGEV (Ballesteros et al., 1995; 
Sanchez et al., 1999). 

The tissue-specific tropism of TGEV has been 
modified by constructing recombinant viruses in 
which part of the S gene from an isolate with 
an exclusively respiratory tropism has been sub- 
stituted by the homologous S gene domain of 
an enteric virus strain. This has been done ei- 
ther by targeted recombination (Sanchez et al., 
1999) or by engineering an infectious cDNA 
clone (Almazan et al., 2000). Studies on the 
tropism of selected recombinants obtained in 
these studies confirmed the need for a second S 
gene domain, distal from the pAPN binding do- 
main, for enteric infection by TGEV. 

The species-specificity of coronaviruses has
also been modified by targeted RNA recombination.
An MHV mutant in which the
ectodomain of the S glycoprotein was replaced
by the highly divergent ectodomain of the S
protein of FIPV, resulting in a chimeric virus,
acquired the ability to infect feline cells and
simultaneously lost the ability to infect murine
cells in tissue culture (Kuo et al., 2000). This
change of tropism opens up the possibility of
engineering coronaviruses to target the desired
species.


L. Enjuanes et al. / Journal of Biotechnology 88 (2001) 183-204 199 


9. Conclusions


Both helper dependent expression systems, 
based on two components, and single genomes 
constructed by targeted recombination, or by us- 
ing infectious cDNAs, have been developed. The 
sequences that regulate transcription have been 
characterized mainly using helper dependent ex- 
pression systems and it will now be possible to 
validate them using single genome systems. Using 
helper dependent expression systems, production 
of high amounts of heterologous antigens (2-8
µg/10^6 cells) has been achieved, and the synthesis
has been maintained for around 10 passages.
These amounts have been sufficient to elicit strong
immune responses.

The genome of coronaviruses has been engi- 
neered either by targeted recombination or by 
modification of cDNAs encoding infectious coro- 
navirus RNAs. In some cases (TGEV), a foreign 
gene (GFP of 0.72 kb) has been efficiently ex- 
pressed during at least 20 passages (Fischer et al., 
1997; Sola et al., 2001b). Thus, a new avenue with
a great deal of potential has been opened for
coronaviruses. Their long genome and enteric
tropism make them of high interest as expression
vectors for vaccine development and,
possibly in the future, also for gene therapy. The
possibility of engineering the tissue and species 
tropism will make coronaviruses very flexible ex- 
pression systems, since the same vector can be 
modified to target expression to different organs 
and animal species, including humans.